First version of pridepy download module#31

Merged
ypriverol merged 7 commits into main from dev on Apr 28, 2026

ypriverol (Member) commented Apr 28, 2026

Pull Request

Description

Checklist

  • Module follows nf-core standards
  • main.nf includes process definition
  • meta.yml includes complete documentation
  • environment.yml specifies dependencies
  • Tests are included
  • Code is formatted (prettier)
  • CI checks pass

Module Type

  • New module
  • Module update
  • Bug fix
  • Documentation

Related Issues

Closes #

Summary by CodeRabbit

  • New Features

    • Added a pridepy module to download spectrum files and query metadata from PRIDE Archive, supporting multiple transfer methods and producing spectra, checksum (optional), and versions outputs.
  • Tests

    • Added tests and Nextflow config to validate stubbed runs with and without checksum checking and to verify produced artifacts.

coderabbitai Bot commented Apr 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 021aabb0-2de2-44d6-be9c-94502a0b7d77

📥 Commits

Reviewing files that changed from the base of the PR and between 8cd0e28 and 49e1cc6.

⛔ Files ignored due to path filters (1)
  • modules/bigbio/pridepy/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (4)
  • modules/bigbio/pridepy/main.nf
  • modules/bigbio/pridepy/meta.yml
  • modules/bigbio/pridepy/tests/main.nf.test
  • modules/bigbio/pridepy/tests/nextflow.config
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/bigbio/pridepy/meta.yml
  • modules/bigbio/pridepy/tests/main.nf.test

📝 Walkthrough

Adds a new pridepy module: Nextflow process for downloading PRIDE Archive data using the pridepy client (pinned v0.0.14), a Conda environment file, module metadata, and tests including stub-mode behavior and checksum handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Module Implementation: modules/bigbio/pridepy/environment.yml, modules/bigbio/pridepy/main.nf, modules/bigbio/pridepy/meta.yml | Adds PRIDEPY_DOWNLOAD Nextflow process with Conda env (pins pridepy==0.0.14), container selection logic, optional execution gating, runtime task.ext.args, version detection via Python importlib.metadata, and declared spectra, checksums, and versions outputs. |
| Test Suite & Config: modules/bigbio/pridepy/tests/main.nf.test, modules/bigbio/pridepy/tests/nextflow.config | Adds two stub-mode tests (with/without --checksum-check) asserting spectra/checksum outputs and snapshots; custom publishDir behavior and process-targeted ext.args wiring in test config. |
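The version detection mentioned in the summary can be sketched roughly as follows. This is a minimal illustration, not the module's actual script: the helper names `detect_version` and `versions_yml` are hypothetical, and the `versions.yml` layout shown (process name, then tool and version) is assumed from the usual nf-core convention rather than taken from this PR's diff.

```python
from importlib.metadata import PackageNotFoundError, version


def detect_version(package: str, fallback: str = "unknown") -> str:
    """Return the installed version of `package`, or `fallback` if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        # Mirrors the "version string or unknown" branch of the walkthrough.
        return fallback


def versions_yml(process_name: str, tool: str, tool_version: str) -> str:
    """Render a versions.yml entry (assumed layout: process -> tool -> version)."""
    return f'"{process_name}":\n    {tool}: {tool_version}\n'


if __name__ == "__main__":
    # Prints the entry for whatever pridepy version is installed locally,
    # or "unknown" when the package is absent.
    print(versions_yml("PRIDEPY_DOWNLOAD", "pridepy", detect_version("pridepy")), end="")
```

Falling back to a fixed string instead of raising keeps the versions output well-formed even when the package cannot be found in the environment.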

Sequence Diagram

sequenceDiagram
    participant Workflow
    participant PRIDEPY_DOWNLOAD
    participant PridepyEnv as "pridepy\n(Conda)"
    participant FS as "FileSystem\noutputs"

    Workflow->>PRIDEPY_DOWNLOAD: submit meta { id, ... }
    alt task.ext.when == null or true
        PRIDEPY_DOWNLOAD->>PridepyEnv: create/activate env (pridepy v0.0.14)
        PRIDEPY_DOWNLOAD->>PridepyEnv: run pridepy with task.ext.args
        PridepyEnv-->>PRIDEPY_DOWNLOAD: downloaded files
        PRIDEPY_DOWNLOAD->>FS: write spectra files, optional checksums
    else stub mode
        PRIDEPY_DOWNLOAD->>FS: create stub .raw and optional checksum
    end
    PRIDEPY_DOWNLOAD->>PridepyEnv: query installed pridepy version
    PridepyEnv-->>PRIDEPY_DOWNLOAD: version string or unknown
    PRIDEPY_DOWNLOAD->>FS: write versions.yml
    PRIDEPY_DOWNLOAD-->>Workflow: emit (meta, spectra?, checksums?, versions.yml)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

Review effort 3/5

Suggested reviewers

  • jpfeuffer
  • timosachsenberg
  • daichengxin

Poem

🐰 A pridepy module hops to life,
Fetching PRIDE files through network and strife,
Versions recorded, checksums may keep,
Stubbed or real, outputs sorted neat,
I nibble carrots and celebrate this new stride!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'First version of pridepy download module' directly and accurately describes the main change: introduction of a new pridepy module for downloading data from PRIDE Archive, with supporting configuration and test files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


codacy-production Bot commented Apr 28, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
modules/bigbio/pridepy/tests/nextflow.config (1)

2-2: Avoid potential publishDir collisions if this module grows to multiple processes.

Line 2 keeps only the first _ token from the process name, so similarly prefixed process names would publish into the same folder. Using the full normalized process name is safer.

Proposed tweak
-process {
-    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }
-}
+process {
+    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].toLowerCase()}" }
+}
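The collision described in this nitpick can be reproduced outside Nextflow with a small Python analogue of the two expressions. This is only a sketch: `tokenize` approximates Groovy's `String.tokenize` for a single delimiter, and both the `WF:` workflow prefix and the second process name `PRIDEPY_QUERY` are hypothetical, chosen only to show two names that share a first `_` token.

```python
from __future__ import annotations


def tokenize(text: str, delimiter: str) -> list[str]:
    """Approximate Groovy's String.tokenize: split and drop empty tokens."""
    return [token for token in text.split(delimiter) if token]


def subdir_first_token(process: str) -> str:
    """Current expression: last ':' segment, then only the first '_' token."""
    return tokenize(tokenize(process, ":")[-1], "_")[0].lower()


def subdir_full_name(process: str) -> str:
    """Proposed expression: the whole last ':' segment, lowercased."""
    return tokenize(process, ":")[-1].lower()


if __name__ == "__main__":
    # Hypothetical fully qualified process names.
    for name in ("WF:PRIDEPY_DOWNLOAD", "WF:PRIDEPY_QUERY"):
        print(name, subdir_first_token(name), subdir_full_name(name))
```

With the first-token form both names map to the same `pridepy` folder, while the full-name form keeps `pridepy_download` and `pridepy_query` apart.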
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/pridepy/tests/nextflow.config` at line 2, The publishDir
currently truncates the process name to only the first '_' token (in the
expression on publishDir), which can cause collisions; update the publishDir
expression to use the full normalized process name: take the task.process token
after the ':' (task.process.tokenize(':')[-1]), convert it to lowercase, and
normalize it (replace or collapse non-alphanumeric characters to underscores)
instead of tokenizing to the first '_' piece so each unique process gets its own
folder under params.outdir.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 403d3c1d-5d29-403f-b16f-5b4e0443989b

📥 Commits

Reviewing files that changed from the base of the PR and between 226b107 and 8cd0e28.

⛔ Files ignored due to path filters (1)
  • modules/bigbio/pridepy/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (5)
  • modules/bigbio/pridepy/environment.yml
  • modules/bigbio/pridepy/main.nf
  • modules/bigbio/pridepy/meta.yml
  • modules/bigbio/pridepy/tests/main.nf.test
  • modules/bigbio/pridepy/tests/nextflow.config

Comment on lines +23 to +26
        then {
            assert process.success
            assert snapshot(process.out.versions).match("versions_stub")
        }

⚠️ Potential issue | 🟡 Minor

Stub test does not validate the download_dir output contract.

Line 25 only checks versions. Since modules/bigbio/pridepy/main.nf (Line 15) also emits download_dir, regressions there can pass unnoticed.

Suggested assertion addition
         then {
             assert process.success
+            assert snapshot(process.out.download_dir).match("download_dir_stub")
             assert snapshot(process.out.versions).match("versions_stub")
         }
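Why an unasserted output can regress silently is easier to see with a toy model of snapshot testing. The sketch below is an analogy only, not nf-test's implementation: `snapshot_match` is a hypothetical helper that records a value the first time a snapshot name is used and compares against the stored value afterwards, and the output values are made up.

```python
import json


def snapshot_match(store: dict, name: str, value) -> bool:
    """Toy snapshot check: record on first use, compare on later runs."""
    serialized = json.dumps(value, sort_keys=True)
    if name not in store:
        store[name] = serialized
        return True
    return store[name] == serialized


if __name__ == "__main__":
    store = {}
    first_run = {"versions": ["versions.yml"], "download_dir": ["downloads"]}
    # Hypothetical regression: download_dir is no longer emitted.
    regressed = {"versions": ["versions.yml"], "download_dir": []}

    for outputs in (first_run, regressed):
        # Checking only `versions` passes on both runs.
        print("versions ok:", snapshot_match(store, "versions_stub", outputs["versions"]))
        # The extra assertion is what catches the regression on the second run.
        print("download_dir ok:", snapshot_match(store, "download_dir_stub", outputs["download_dir"]))
```

Only the second snapshot changes between the two runs, so a test that checks `versions` alone would stay green.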
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/bigbio/pridepy/tests/main.nf.test` around lines 23 - 26, The test
currently only asserts process.out.versions; add an assertion that validates the
emitted download_dir output from the workflow (e.g., assert
snapshot(process.out.download_dir).match("download_dir_stub") or an equivalent
existence/structure check) so regressions to the download_dir output from
modules/bigbio/pridepy/main.nf are caught; update the test block that references
process and snapshot to include this download_dir assertion alongside the
existing versions assertion.

Comment thread modules/bigbio/pridepy/main.nf Outdated
fix(pridepy): drop output/ subdir, write to workDir
ypriverol merged commit f8c10a9 into main on Apr 28, 2026
12 checks passed